ARE MASS EXTINCTIONS REALLY PERIODIC?


Sheldon M. Ross

Department of Industrial Engineering and Operations Research
University of California, Berkeley


Abstract - It is argued that the analysis of family extinction data that
resulted in the claim of a 26-Myr periodicity of mass extinctions was flawed
in that it did not allow for the possibility of a symmetric random walk model,
which is shown to be perfectly consistent with the data.

1. INTRODUCTION AND SUMMARY

In [1] Raup and Sepkoski analyzed data relating to the proportion of
families that became extinct in each of 39 successive time periods of
(average) length 6.2 million years. Stating that the data indicated a
periodicity of mass extinctions, they then presented a statistical analysis
which they claimed verified this periodicity.

In Section 2 of this note we point out that there was a basic flaw in the
statistical analysis given in [1], since it did not allow for the possibility
that the data was generated by a random walk model. In Section 3 we show that
the random walk model is perfectly consistent with the data presented in [1].

2. A CRITIQUE OF THE STATISTICAL ANALYSIS IN [1]

In verifying that the data implied a periodicity in mass extinctions,
Raup and Sepkoski computed the value of a statistic which is indicative of
data periodicity, and then compared this value with its set of possible values
under all permutations of the 39 data values. However, such a permutation
test is only meaningful if the set of alternative hypotheses is such that,
conditional on the set of data values, all 39! possible orderings are equally
likely. That is, such a test is meaningful if one is testing periodicity
against the alternative hypothesis that the data values constitute a random
sample from some arbitrary probability distribution. It is not a meaningful
test if the alternative is that the incremental changes of the data constitute
a random sample (the so-called random walk model). Indeed, as the random
walk model appears to be the usually assumed model for extinction (as
mentioned in [2], [3], [4], and even in [1], since a random walk model would
arise from a standard birth-death model when analyzed in discrete time), this
appears to be a serious oversight.

We will now present a nonparametric analysis which indicates that the
random walk model is perfectly capable of explaining the perceived periodicity
of mass extinctions. Indeed, as mentioned in [2] and [3], the appearance of
such a periodicity is quite possibly solely a function of the definition of an
extinction peak.

3. ANALYZING THE VIABILITY OF THE SYMMETRIC RANDOM WALK MODEL

Raup and Sepkoski defined an extinction peak to occur at time period i if
$D_{i-1} < D_i > D_{i+1}$, where $D_i$ represents the data value for period i.
That is, an extinction peak occurs each time a data value is larger than both
its neighbors. Suppose now that the data was actually generated by a random
walk mechanism so that each value had probability 1/2 of being greater and
probability 1/2 of being less than its predecessor. As noted in [2] and [3],
this would imply that any given time period will constitute an extinction peak
with probability 1/4, and so, on average, such peaks would occur one-fourth of
the time. However, as also noted in [2], there is some variance involved and
so the above by itself does not indicate that the symmetric random walk model
is consistent with the data of [1]. We will now show that this is the case.
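The peak definition and the one-fourth frequency can be checked numerically. The sketch below is only an illustration: the function names, the seed, the walk length, and the use of Gaussian increments are assumptions made here (any increment distribution that is equally likely to be positive or negative would serve), and none of them come from [1].

```python
import random


def find_peaks(values):
    """Return the indices i with values[i-1] < values[i] > values[i+1]."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


def simulate_walk(n_periods, rng):
    """Symmetric random walk: each increment is equally likely to be
    positive or negative (Gaussian increments are an assumption)."""
    walk = [0.0]
    for _ in range(n_periods - 1):
        walk.append(walk[-1] + rng.gauss(0.0, 1.0))
    return walk


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
walk = simulate_walk(100_000, rng)

# Only interior periods can be peaks, hence the denominator len(walk) - 2.
peak_fraction = len(find_peaks(walk)) / (len(walk) - 2)
print(round(peak_fraction, 3))
```

On a long simulated walk the printed fraction should be close to 1/4, in line with the argument above.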

To test the symmetric random walk hypothesis note first that it implies
that the successive times between extinction peaks are independent random
variables with the common distribution

$$P\{X = k\} = \frac{k-1}{2^k}, \quad k \ge 2 \qquad (1)$$

where X denotes the number of time periods between peaks; for instance, X
will equal 2 if there are peaks at periods r and r+2. Equation (1) is
verified by noting that X will equal k if for some i, i = 0, ..., k-2, the
first i incremental values following a particular extinction trio are
negative, the next k-1-i are positive, and the next one is negative. Each of
these k-1 patterns of k increments has probability $2^{-k}$. The mean and
variance of X are

$$E[X] = \mathrm{Var}(X) = 4$$
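One way to see both equation (1) and these moments is to split X into two waiting times. The symbols $G_1$ and $G_2$ below are notation introduced here for illustration and do not appear in the original argument: $G_1$ counts the increments up to and including the first positive one, and $G_2$ counts the further increments up to and including the first negative one.

```latex
\begin{align*}
X &= G_1 + G_2, \qquad P\{G_j = n\} = 2^{-n}, \quad n \ge 1, \quad j = 1, 2, \\
P\{X = k\} &= \sum_{n=1}^{k-1} 2^{-n} \, 2^{-(k-n)} = \frac{k-1}{2^k}, \qquad k \ge 2, \\
E[X] &= E[G_1] + E[G_2] = 2 + 2 = 4, \\
\mathrm{Var}(X) &= \mathrm{Var}(G_1) + \mathrm{Var}(G_2) = 2 + 2 = 4.
\end{align*}
```

The last two lines use the mean $1/p = 2$ and variance $(1-p)/p^2 = 2$ of a geometric random variable with $p = 1/2$, together with the independence of the increments.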

We will now test the symmetric random walk hypothesis by performing a
goodness of fit test on the times between successive extinction peaks for the
data in [1]. As a prelude, say that an interpeak time X falls in region i if
X = i+1, i = 1, 2, 3, 4, and in region 5 if X ≥ 6, and note that

$p_1 = P\{X = 2\} = .25$
$p_2 = P\{X = 3\} = .25$
$p_3 = P\{X = 4\} = .1875$
$p_4 = P\{X = 5\} = .125$
$p_5 = P\{X \ge 6\} = .1875$
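These region probabilities follow directly from equation (1). A short sketch with exact rational arithmetic reproduces them; the function and variable names are chosen here for illustration only.

```python
from fractions import Fraction


def interpeak_pmf(k):
    """P{X = k} = (k - 1) / 2**k for k >= 2, as in equation (1)."""
    return Fraction(k - 1, 2 ** k)


# Regions 1-4 correspond to X = 2, 3, 4, 5; region 5 collects X >= 6,
# so its probability is whatever mass remains.
region_probs = [interpeak_pmf(k) for k in range(2, 6)]
region_probs.append(1 - sum(region_probs))

for i, p in enumerate(region_probs, start=1):
    print(f"p_{i} = {p} = {float(p)}")
```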

Now the 11 values of the time periods between successive extinction peaks
given in [1] are 3, 4, 4, 2, 2, 3, 3, 4, 5, 5, 4. Hence, letting $N_i$ denote
the number falling in region i, we have that

$$N_1 = 2, \quad N_2 = 3, \quad N_3 = 4, \quad N_4 = 2, \quad N_5 = 0$$

The value of the goodness of fit test statistic is therefore

$$T = \sum_{i=1}^{5} \frac{(N_i - 11 p_i)^2}{11 p_i} = 4.394$$
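The arithmetic behind this value can be reproduced in a few lines. In the sketch below the interpeak times and region probabilities are the ones quoted in the text, while the helper names are assumptions introduced for illustration.

```python
from collections import Counter

# Interpeak times quoted from [1] and the region probabilities p_1..p_5.
interpeak_times = [3, 4, 4, 2, 2, 3, 3, 4, 5, 5, 4]
region_probs = [0.25, 0.25, 0.1875, 0.125, 0.1875]


def region_counts(times):
    """Count times per region: X = 2, 3, 4, 5 map to regions 1-4, X >= 6 to region 5."""
    tally = Counter(min(x, 6) - 2 for x in times)
    return [tally[r] for r in range(5)]


def goodness_of_fit_statistic(counts, probs):
    """Sum over regions of (observed - expected)^2 / expected."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


counts = region_counts(interpeak_times)
T = goodness_of_fit_statistic(counts, region_probs)
print(counts, round(T, 3))  # [2, 3, 4, 2, 0] 4.394
```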
As it is not apparent that the sample size 11 is large enough to suppose that
T will, under the symmetric random walk hypothesis, have approximately a
chi-square distribution with 4 degrees of freedom, the probability that T
would have been as large as 4.394 when the distribution is given by (1) was
determined by a simulation study using 10,000 runs. The results of this
simulation were that this probability (commonly referred to as the p-value) is
equal to .3438. (The chi-square approximation would have yielded the value
.3547.) Therefore, a deviation from the random walk fit as large as observed
would be expected to occur about 35% of the time when the random walk model is
correct. This shows that the symmetric random walk hypothesis is perfectly
consistent with the data of [1].
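A simulation of this kind can be sketched as follows. This is not the code used for the study reported above: the seed, the helper names, and the device of sampling X as the sum of two geometric waiting times (which has distribution (1)) are all assumptions made for illustration, so the estimate will differ slightly from .3438.

```python
import random

REGION_PROBS = [0.25, 0.25, 0.1875, 0.125, 0.1875]
OBSERVED_COUNTS = [2, 3, 4, 2, 0]  # N_1..N_5 from the data of [1]
N_TIMES = 11                       # interpeak times per simulated data set
N_RUNS = 10_000


def geometric_half(rng):
    """Number of fair coin flips up to and including the first tail."""
    n = 1
    while rng.random() < 0.5:
        n += 1
    return n


def sample_interpeak_time(rng):
    """Sum of two independent geometric(1/2) variables: P{X = k} = (k-1)/2**k."""
    return geometric_half(rng) + geometric_half(rng)


def statistic(counts):
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, REGION_PROBS))


def simulated_counts(rng):
    counts = [0] * 5
    for _ in range(N_TIMES):
        counts[min(sample_interpeak_time(rng), 6) - 2] += 1
    return counts


rng = random.Random(1)  # arbitrary seed, for repeatability only
t_observed = statistic(OBSERVED_COUNTS)
# Small tolerance so simulated values tied with the observed one count as "as large".
exceed = sum(statistic(simulated_counts(rng)) >= t_observed - 1e-9
             for _ in range(N_RUNS))
p_value = exceed / N_RUNS
print(p_value)
```

With 10,000 runs the Monte Carlo standard error is roughly .005, so estimates from different seeds should agree to about two decimal places.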

Remark: The distribution of times between peaks given by (1) was also
independently noted in [5], where goodness of fit tests relating to the mean
and variance (but not the distribution function as done above) were presented.